AITopics | Sarasota

News recommendation systems personalize homepage content to boost engagement, but factors like content type, editorial stance, and geographic focus impact recommendations. Local newspapers balance coverage across regions, yet identifying local articles is challenging due to implicit location cues like slang or landmarks. Traditional methods, such as Named Entity Recognition (NER) and Knowledge Graphs, infer locations, but Large Language Models (LLMs) offer new possibilities while raising concerns about accuracy and explainability. This paper explores LLMs for local article classification in Taboola's "Homepage For You" system, comparing them to traditional techniques. Key findings: (1) Knowledge Graphs enhance NER models' ability to detect implicit locations, (2) LLMs outperform traditional methods, and (3) LLMs can effectively identify local content without requiring Knowledge Graph integration. Offline evaluations showed LLMs excel at implicit location classification, while online A/B tests showed a significant increased in local views. A scalable pipeline integrating LLM-based location classification boosted local article distribution by 27%, preserving newspapers' brand identity and enhancing homepage personalization.

newspaper, wang, zhang, (12 more...)

arXiv.org Artificial Intelligence

2502.1466

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.05)
North America > United States > Florida > Sarasota County > Sarasota (0.05)
(10 more...)

Genre: Research Report > Experimental Study (0.69)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Uncovering Attacks and Defenses in Secure Aggregation for Federated Deep Learning

Zhang, Yiwei, Behnia, Rouzbeh, Yavuz, Attila A., Ebrahimi, Reza, Bertino, Elisa

arXiv.org Artificial IntelligenceOct-17-2024

Federated learning enables the collaborative learning of a global model on diverse data, preserving data locality and eliminating the need to transfer user data to a central server. However, data privacy remains vulnerable, as attacks can target user training data by exploiting the updates sent by users during each learning iteration. Secure aggregation protocols are designed to mask/encrypt user updates and enable a central server to aggregate the masked information. MicroSecAgg (PoPETS 2024) proposes a single server secure aggregation protocol that aims to mitigate the high communication complexity of the existing approaches by enabling a one-time setup of the secret to be re-used in multiple training iterations. In this paper, we identify a security flaw in the MicroSecAgg that undermines its privacy guarantees. We detail the security flaw and our attack, demonstrating how an adversary can exploit predictable masking values to compromise user privacy. Our findings highlight the critical need for enhanced security measures in secure aggregation protocols, particularly the implementation of dynamic and unpredictable masking strategies. We propose potential countermeasures to mitigate these vulnerabilities and ensure robust privacy protection in the secure aggregation frameworks.

artificial intelligence, iteration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.09676

Country:

North America > United States > Florida > Sarasota County > Sarasota (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

How Meteorologists Are Using AI to Forecast Hurricane Milton and Other Storms

TIME - TechOct-9-2024, 21:23:22 GMT

On Wednesday evening, Hurricane Milton will become the fifth hurricane in 2024 to make landfall in the mainland U.S. As storms like this one grow more frequent and intense, artificial intelligence is playing an increasingly central role in efforts by meteorologists and other scientists to track these storms and mitigate their harms. For years, meteorologists have built complex forecasting models of storms based on wind speeds, temperature, humidity and other factors, and recorded via readings from planes, buoys and satellites. But these models can take hours to produce updated forecasts. Machine learning models, on the other hand, draw upon vast knowledge of the earth's atmosphere and data from how previous storms have unfolded. They excel at pattern recognition, teasing out trends that most humans can't discern in a fraction of the time.

hurricane milton, lanza, modeling, (13 more...)

TIME - Tech

Country:

North America > United States > Texas (0.06)
North America > United States > Florida > Sarasota County > Sarasota (0.05)
Europe > United Kingdom (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG

Moreira, Gabriel de Souza P., Ak, Ronay, Schifferer, Benedikt, Xu, Mengyao, Osmulski, Radek, Oldridge, Even

arXiv.org Artificial IntelligenceSep-11-2024

Ranking models play a crucial role in enhancing overall accuracy of text retrieval systems. These multi-stage systems typically utilize either dense embedding models or sparse lexical indices to retrieve relevant passages based on a given query, followed by ranking models that refine the ordering of the candidate passages by its relevance to the query. This paper benchmarks various publicly available ranking models and examines their impact on ranking accuracy. We focus on text retrieval for question-answering tasks, a common use case for Retrieval-Augmented Generation systems. Our evaluation benchmarks include models some of which are commercially viable for industrial applications. We introduce a state-of-the-art ranking model, NV-RerankQA-Mistral-4B-v3, which achieves a significant accuracy increase of ~14% compared to pipelines with other rerankers. We also provide an ablation study comparing the fine-tuning of ranking models with different sizes, losses and self-attention mechanisms. Finally, we discuss challenges of text retrieval pipelines with ranking models in real-world industry applications, in particular the trade-offs among model size, ranking accuracy and system requirements like indexing and serving latency / throughput.

arxiv preprint arxiv, pipeline, ranking model, (13 more...)

arXiv.org Artificial Intelligence

2409.07691

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
South America > Brazil > São Paulo (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(4 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Automatic identification of the area covered by acorn trees in the dehesa (pastureland) Extremadura of Spain

Benjamin, Ojeda-Magaña, Ruben, Ruelas, Joel, Quintanilla-Dominguez, Leopoldo, Gomez-Barba, Juan, Lopez de Herrera, Jose, Robledo-Hernandez, Ana, Tarquis

arXiv.org Artificial IntelligenceAug-7-2024

The acorn is the fruit of the oak and is an important crop in the Spanish dehesa extreme\~na, especially for the value it provides in the Iberian pig food to obtain the "acorn" certification. For this reason, we want to maximise the production of Iberian pigs with the appropriate weight. Hence the need to know the area covered by the crowns of the acorn trees, to determine the covered wooded area (CWA, from the Spanish Superficie Arbolada Cubierta SAC) and thereby estimate the number of Iberian pigs that can be released per hectare, as indicated by the royal decree 4/2014. In this work, we propose the automatic estimation of the CWA, through aerial digital images (orthophotos) of the pastureland of Extremadura, and with this, to offer the possibility of determining the number of Iberian pigs to be released in a specific plot of land. Among the main issues for automatic detection are, first, the correct identification of acorn trees, secondly, correctly discriminating the shades of the acorn trees and, finally, detect the arbuscles (young acorn trees not yet productive, or shrubs that are not oaks). These difficulties represent a real challenge, both for the automatic segmentation process and for manual segmentation. In this work, the proposed method for automatic segmentation is based on the clustering algorithm proposed by Gustafson-Kessel (GK) but the modified version of Babuska (GK-B) and on the use of real orthophotos. The obtained results are promising both in their comparison with the real images and when compared with the images segmented by hand. The whole set of orthophotos used in this work correspond to an approximate area of 142 hectares, and the results are of great interest to producers of certified "acorn" pork.

automatic segmentation, orthophoto, segmentation, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.compag.2020.105289

2408.03542

Country:

North America > Mexico (0.05)
Europe > Spain > Galicia > Madrid (0.05)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)

Add feedback

NV-Retriever: Improving text embedding models with effective hard-negative mining

Moreira, Gabriel de Souza P., Osmulski, Radek, Xu, Mengyao, Ak, Ronay, Schifferer, Benedikt, Oldridge, Even

arXiv.org Artificial IntelligenceJul-22-2024

Text embedding models have been popular for information retrieval applications such as semantic search and Question-Answering systems based on Retrieval-Augmented Generation (RAG). Those models are typically Transformer models that are fine-tuned with contrastive learning objectives. Many papers introduced new embedding model architectures and training approaches, however, one of the key ingredients, the process of mining negative passages, remains poorly explored or described. One of the challenging aspects of fine-tuning embedding models is the selection of high quality hard-negative passages for contrastive learning. In this paper we propose a family of positive-aware mining methods that leverage the positive relevance score for more effective false negatives removal. We also provide a comprehensive ablation study on hard-negative mining methods over their configurations, exploring different teacher and base models. We demonstrate the efficacy of our proposed methods by introducing the NV-Retriever-v1 model, which scores 60.9 on MTEB Retrieval (BEIR) benchmark and 0.65 points higher than previous methods. The model placed 1st when it was published to MTEB Retrieval on July 07, 2024.

arxiv preprint arxiv, mining method, teacher model, (13 more...)

arXiv.org Artificial Intelligence

2407.15831

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
South America > Brazil > São Paulo (0.04)
Oceania > Australia > Queensland > Brisbane (0.04)
(5 more...)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)

Add feedback

Meta-Task Planning for Language Agents

Zhang, Cong, Deik, Derrick Goh Xin, Li, Dexun, Zhang, Hao, Liu, Yong

arXiv.org Artificial IntelligenceMay-30-2024

The rapid advancement of neural language models has sparked a new surge of intelligent agent research. Unlike traditional agents, large language model-based agents (LLM agents) have emerged as a promising paradigm for achieving artificial general intelligence (AGI) due to their superior reasoning and generalization capabilities. Effective planning is crucial for the success of LLM agents in real-world tasks, making it a highly pursued topic in the community. Current planning methods typically translate tasks into executable action sequences. However, determining a feasible or optimal sequence for complex tasks at fine granularity, which often requires compositing long chains of heterogeneous actions, remains challenging. This paper introduces Meta-Task Planning (MTP), a zero-shot methodology for collaborative LLM-based multi-agent systems that simplifies complex task planning by decomposing it into a hierarchy of subordinate tasks, or meta-tasks. Each meta-task is then mapped into executable actions. MTP was assessed on two rigorous benchmarks, TravelPlanner and API-Bank. Notably, MTP achieved an average $\sim40\%$ success rate on TravelPlanner, significantly higher than the state-of-the-art (SOTA) baseline ($2.92\%$), and outperforming $LLM_{api}$-4 with ReAct on API-Bank by $\sim14\%$, showing the immense potential of integrating LLM with multi-agent systems.

agent, constraint, information, (15 more...)

arXiv.org Artificial Intelligence

2405.1651

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New York (0.04)
(7 more...)

Genre:

Workflow (0.66)
Research Report (0.64)

Industry: Consumer Products & Services > Travel (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Low-rank matrix reconstruction and clustering via approximate message passing

Neural Information Processing SystemsMar-13-2024, 16:47:32 GMT

We study the problem of reconstructing low-rank matrices from their noisy observations. We formulate the problem in the Bayesian framework, which allows us to exploit structural properties of matrices in addition to low-rankedness, such as sparsity. We propose an efficient approximate message passing algorithm, derived from the belief propagation algorithm, to perform the Bayesian inference for matrix reconstruction. We have also successfully applied the proposed algorithm to a clustering problem, by reformulating it as a low-rank matrix reconstruction problem with an additional structural property. Numerical experiments show that the proposed algorithm outperforms Lloyd's K-means algorithm.

algorithm, amp algorithm, matrix reconstruction, (13 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback